Preliminary Work on Speech Unit Selection Using Syntax Phonology Interface
This paper proposes an approach which uses a syntax-phonology interface to select the most appropriate speech units for a target sentence. The selection of the speech units is done by constructing the syntax-phonology tree structure of the target sentence. The construction of the syntax-phonology tree is adapted from the example-based parsing of UTMK machine translation
Building An Ontology-Based Multilingual Lexicon For Word Sense Disambiguation In Machine Translation.
Word sense disambiguation (WSD) requires the establishment of a list of the different meanings of words. WSD efforts in machine translation require, in addition, the equivalent translation words in target languages
Lost in Translation: Word Sense Disambiguation.
In natural languages, a word can take on different meanings in different contexts. Word sense disambiguation (WSD) refers to the task of determining the correct meaning or sense of a word in context
A Synchronization Structure Of SSTC And Its Applications In Machine Translation.
In this paper, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is introduced. In order to describe the correspondence between different languages, we propose a variant of SSTC called synchronous SSTC (S-SSTC). We will also describe how S-SSTC provides the flexibility to treat some of the non-standard cases, which are problematic to other synchronous formalisms
Digitising Dictionaries For Advanced Look-Up And Lexical Knowledge Research In Malay.
Electronic dictionaries need not be mere OCR-digitised versions of their paper-form counterparts: they can be made more computer-tractable to facilitate more meaningful operations and data exchange. For instance, explicitly annotating different fields in a dictionary entry allows more targeted look-ups, as we will show using Kamus Dewan as an example. Dictionary data can also be reorganised to enable semantic-based search. The wordnet lexical database is one such model, for which we created a prototype for the Malay language. As both the proposed annotated Kamus Dewan and Malay WordNet are compiled according to established standards and guidelines, the data can be aligned with similar lexical resources of other languages. This provides a means for mutual sharing, interchange and enrichment of lexical data and knowledge between Malay and other languages
Porting SIMR/GSA Algorithms For Mapping And Alignment To Malay-English Bitexts.
Parallel texts, or bitexts, in which the same content is available in several languages as a result of document translation, are becoming plentiful and available, both in private data warehouses and on publicly accessible sites on the WWW
Building A Semantic-Primitive-Based Lexical Consultation System.
The paper describes the design of a semantic-primitive-based lexical consultation system and the possible processes to be performed on a machine-readable dictionary (MRD) and corpus to produce a machine-tractable dictionary